A universal assortativity measure for network analysis 
Characterizing the connectivity tendency of a network is a fundamental problem in network science. The traditional and well-known assortativity coefficient is calculated on a per-network basis, which makes it of little use for analyzing the connection tendency of part of a network. This paper proposes a universal assortativity coefficient (UAC), which is based on an unambiguous definition of each individual edge's contribution to the global assortativity coefficient (GAC). It is able to reveal the connection tendency of microscopic, mesoscopic, and macroscopic structures, and of any given part of a network. Applying UAC to real-world networks, we find that, contrary to popular expectation, most networks (notably the AS-level Internet topology) have markedly more assortative edges/nodes than dissortative ones despite their global dissortativity. Consequently, networks can be categorized along two dimensions: the single global assortativity coefficient and local assortativity statistics. A detailed anatomy of the AS-level Internet topology further illustrates how UAC can be used to decipher hidden patterns of connection tendencies on different scales.

PACS numbers: 89.75.Fb,89.75.Hc,89.20.Hh 
I. INTRODUCTION

Networks have become a useful and widely adopted tool in a wide spectrum of research areas, ranging from traditional communication and transportation networks to more recently emerging networks as complex as online social networks and brain networks [1-15]. The assortativity coefficient is a basic metric that characterizes the connectivity tendency of a network, i.e., whether, globally, nodes of similar (or dissimilar) degrees are more likely to be connected [16]. However, this metric is a macroscopic property, which becomes useless when microscale or mesoscale analysis is required. In other words, one cannot tell the exact intra-group or inter-group connection tendencies from the per-network assortativity coefficient.

Experimental studies have shown that various forms of groups are hidden in real networks. These groups can take the form of communities, motifs, cliques, etc. [19-21]. Multi-scale analysis, especially at the mesoscale, is very important for understanding the roles and dynamics of these
groups [UGH. However, previous studies typically focus 
on uncovering these groups within a network, and treat isomorphic modular components as identical. In other words, a component is studied solely as a subgraph extracted from the whole network, neglecting the links that connect this subgraph to other parts of the graph. This traditional method inevitably fails to capture the functional difference between isomorphic modular components. Indeed, the functional roles or dynamics of a group can only be comprehensively understood when the group is put in the global context. An important distinguishing property is whether the group under consideration is assortatively or dissortatively mixed with its local surroundings, which can have quite different influence on dynamics such as information diffusion and disease spreading [22] and resilience against attacks [23]. Fig. 1 gives an illustrative example. In this figure, two triangles A and B are located in different surroundings. Triangle A is surrounded by high-degree nodes, i.e., it is dissortatively mixed with the outside world, whereas triangle B is surrounded by low-degree nodes, i.e., it is assortatively mixed with the outside world. This causes A and B to behave quite differently in the process of information or disease diffusion. In this simple example, suppose that the SIR model is used to model a disease spreading process and the infection probability p is set to 0.5. If A serves as the source of the spreading process, the expected number of infected nodes accounts for about 23% of all nodes; in contrast, if B serves as the source, fewer than 4% of the nodes are expected to be infected. This drastic discrepancy apparently comes from the difference in connectivity tendency between the two triangles and their respective outside worlds.
Hence, in order to analyze and explore network structure accurately, which in turn helps to better understand the dynamics of complex systems, it is critically important to measure intra-group and inter-group connection tendencies in the global context. In this paper, we propose a universal assortativity coefficient that is based on an unambiguous definition of each individual edge's contribution to the global assortativity coefficient. This metric allows assortativity analysis on any part of a network and reveals some hidden network connectivity patterns.

FIG. 1. An illustrative example showing how local connectivity patterns differentiate two isomorphic components (i.e., A and B). Different local connectivity patterns have different effects on dynamics, such as disease spreading.



II. UNIVERSAL ASSORTATIVITY COEFFICIENT

In order to measure the assortativity of a network on different scales, we propose a uniform metric called the universal assortativity coefficient, which measures the assortativity of any subset of connections. Simply put, it is the sum of the contributions of the individual edges in that subset to the global assortativity coefficient. Hence, we begin with our definition of each individual edge's contribution to the global assortativity coefficient.

Before the formal definition, it is necessary to review some related concepts discussed by Newman [16]. For simplicity, all the concepts we discuss are based on undirected networks. With minor or moderate adjustments, these concepts can also be applied to directed networks. The degree distribution $p(k)$ is the probability that a randomly chosen node has degree $k$. The remaining degree distribution $q(k)$ is the probability that, following a randomly chosen edge, the remaining degree of the reached node is $k$. Here, the remaining degree is the number of edges leaving this node other than the one we arrived along, which is one less than the total degree of the node. The normalized distribution $q(k)$ of the remaining degree is:

$$q(k) = \frac{(k+1)\,p(k+1)}{\sum_j j\,p(j)} \qquad (1)$$

The joint probability distribution $e_{ij}$ of the remaining degrees at the two ends of a randomly chosen edge is the probability that the remaining degrees of the two endpoints of a randomly chosen edge are $i$ and $j$.

Following these definitions, the assortativity coefficient $r$ is defined as:

$$r = \frac{1}{\sigma_q^2}\Big[\sum_{jk} jk\,\big(e_{jk} - q(j)q(k)\big)\Big] \qquad (2)$$

where $\sigma_q$ is the standard deviation of the remaining degree distribution $q(k)$. For an uncorrelated network, $r = 0$; when the network is assortatively mixed, i.e., nodes of similar degrees are more likely to be connected, $r$ is positive; when the network is dissortatively mixed, i.e., nodes of dissimilar degrees tend to connect to each other, $r$ is negative.

Now consider each individual edge's contribution to the network assortativity coefficient $r$. Denote by $U_q = \sum_j j\,q(j)$ the expected value of the remaining degree; then $r$ can be rewritten as:

$$
\begin{aligned}
r &= \frac{1}{\sigma_q^2}\Big[\sum_{jk} jk\,\big(e_{jk} - q(j)q(k)\big)\Big] \\
  &= \frac{\sum_{jk} jk\,e_{jk} - U_q^2}{\sigma_q^2} \\
  &= \frac{\sum_{jk} jk\,e_{jk} - U_q\sum_j j\,q(j) - U_q\sum_k k\,q(k) + U_q^2}{\sigma_q^2} \\
  &= \frac{\sum_{jk} jk\,e_{jk} - U_q\sum_j j\sum_k e_{jk} - U_q\sum_k k\sum_j e_{jk} + U_q^2}{\sigma_q^2} \\
  &= \frac{\sum_{jk} jk\,e_{jk} - U_q\sum_j\sum_k (j+k)\,e_{jk} + U_q^2}{\sigma_q^2} \\
  &= \frac{\sum_{jk}\big(jk - (j+k)\,U_q + U_q^2\big)\,e_{jk}}{\sigma_q^2} \\
  &= \frac{\sum_{jk}(j - U_q)(k - U_q)\,e_{jk}}{\sigma_q^2} \\
  &= \frac{E[(J - U_q)(K - U_q)]}{\sigma_q^2}
\end{aligned}
$$

where $J$ and $K$ are random variables for the remaining degrees at the two ends of an edge, which have the same expected value $U_q$. Following the above equation, we see that each edge's contribution to $r$ is:

$$\rho_e = \frac{(j - U_q)(k - U_q)}{M\sigma_q^2} \qquad (3)$$

where $M$ is the number of edges, and $j$, $k$ are the remaining degrees of the two endpoints of edge $e$. It is easy to see that $r = \sum_{e=1}^{M} \rho_e$.

When the network is completely homogeneous, i.e., all nodes have the same degree, $\sigma_q = 0$ and $\rho_e$ becomes undefined. Since each edge has the same contribution to $r$ in this case, we define $\rho_e$ to be $1/M$.
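
As a minimal sketch of the edge contribution defined above, the following Python function computes $\rho_e$ for every edge of a small undirected simple graph. The function name `edge_contributions`, the edge-list input format, and the value $1/M$ used for homogeneous networks are illustrative assumptions, not the authors' implementation. $U_q$ and $\sigma_q^2$ are estimated from the remaining degrees observed at both ends of all edges.

```python
from collections import Counter


def edge_contributions(edges):
    """Map each undirected edge (u, v) to its contribution rho_e to r."""
    edges = [tuple(e) for e in edges]
    m = len(edges)
    degree = Counter(node for edge in edges for node in edge)
    # Remaining degree (degree - 1) seen at each of the 2M edge ends;
    # their empirical distribution is q(k).
    ends = [degree[node] - 1 for edge in edges for node in edge]
    u_q = sum(ends) / (2 * m)                            # U_q
    var_q = sum((j - u_q) ** 2 for j in ends) / (2 * m)  # sigma_q^2
    if var_q == 0:
        # Homogeneous network: assume equal contributions of 1/M per edge.
        return {edge: 1.0 / m for edge in edges}
    return {
        (u, v): (degree[u] - 1 - u_q) * (degree[v] - 1 - u_q) / (m * var_q)
        for u, v in edges
    }


# Hypothetical toy graph: a star with three leaves (hub 0).
star = [(0, 1), (0, 2), (0, 3)]
rho = edge_contributions(star)
r = sum(rho.values())  # the contributions add up to the global coefficient
```

For this star, every edge joins the hub (remaining degree 2) to a leaf (remaining degree 0), so each edge contributes -1/3 and the contributions sum to r = -1.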

If $\rho_e > 0$, then $e$ is called an assortative edge; if $\rho_e < 0$, it is called a dissortative edge. Under this definition, if the remaining degrees of both endpoints are greater (or both less) than the global expected remaining degree $U_q$, then the edge is assortative, and the more the two endpoints' remaining degrees deviate from $U_q$, the more assortative the edge is. Otherwise, the edge is dissortative. In other words, the edge assortativeness is a scaled product of the differences between the two endpoints' remaining degrees and the global expected remaining degree. The absolute value of the contribution, $|\rho_e|$, is termed the assortative/dissortative strength of the corresponding edge. We define $S_{ae}$ to be the average strength of assortative edges, and $S_{de}$ to be the average strength of dissortative edges. The ratio of assortative edges is denoted by $P(\rho_e > 0)$.

Finally, the universal assortativity coefficient for a targeted edge set $E_{target}$ is defined as:

$$\rho = \sum_{e \in E_{target}} \rho_e = \sum_{e \in E_{target}} \frac{(j - U_q)(k - U_q)}{M\sigma_q^2} \qquad (4)$$


Based on this metric, it is easy to measure assortativity on different scales. For example, to measure the connectivity tendency of a single node $v$, denoted by $\rho_v$, simply set $E_{target}$ to be the edges emanating from the node. If $\rho_v > 0$, we call $v$ an assortative node; if $\rho_v < 0$, we call $v$ a dissortative node. To measure the connectivity tendency within a group, $E_{target}$ is set to the edges within this group. If we set $E_{target}$ to be the whole edge set $E$, we arrive at Newman's global assortativity coefficient [16]. To measure the connectivity tendency between groups, simply set $E_{target}$ to be the edges between the groups. In this sense, the metric can be used to measure connection tendencies on different scales, and thus deserves the name universal assortativity coefficient (UAC).
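
The choices of target edge set described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the names `uac`, `node_edges`, `intra_group_edges` and `inter_group_edges` are hypothetical helpers, the graph is an undirected simple graph given as an edge list, and a contribution of $1/M$ per edge is assumed for homogeneous networks. The key point it shows is that $U_q$, $\sigma_q^2$ and $M$ are always taken from the whole network, while only the summation runs over the target edges.

```python
from collections import Counter


def uac(edges, target):
    """Universal assortativity of `target`, a subset of the edge list `edges`.

    U_q, sigma_q^2 and M always come from the whole network, so the value
    measures the target's share of the global coefficient r.
    """
    m = len(edges)
    degree = Counter(node for edge in edges for node in edge)
    ends = [degree[node] - 1 for edge in edges for node in edge]
    u_q = sum(ends) / (2 * m)
    var_q = sum((j - u_q) ** 2 for j in ends) / (2 * m)
    if var_q == 0:
        return len(target) / m  # assumed 1/M per edge in homogeneous networks
    return sum(
        (degree[u] - 1 - u_q) * (degree[v] - 1 - u_q) for u, v in target
    ) / (m * var_q)


def node_edges(edges, node):
    """Edges emanating from one node (node-level analysis)."""
    return [e for e in edges if node in e]


def intra_group_edges(edges, group):
    """Edges with both endpoints inside `group`."""
    return [(u, v) for u, v in edges if u in group and v in group]


def inter_group_edges(edges, group_a, group_b):
    """Edges with one endpoint in `group_a` and the other in `group_b`."""
    return [
        (u, v) for u, v in edges
        if (u in group_a and v in group_b) or (u in group_b and v in group_a)
    ]


# Hypothetical toy graph: a path 0 - 1 - 2 - 3.
path = [(0, 1), (1, 2), (2, 3)]
rho_global = uac(path, path)                                  # equals r
rho_node_1 = uac(path, node_edges(path, 1))                   # node 1
rho_inner = uac(path, intra_group_edges(path, {1, 2}))        # inside {1, 2}
rho_cut = uac(path, inter_group_edges(path, {0, 1}, {2, 3}))  # between groups
```

On this path, the two end edges are dissortative and the middle edge is assortative, so the global value is negative while the value for the group {1, 2} is positive.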

Returning to our example in Fig. 1, the global assortativity coefficient $\rho$ is -0.804, indicating strong dissortativity. However, this global knowledge is of little use for understanding the functional roles of local components such as A and B. Based on UAC, we can quantitatively measure the connection tendency between A and the remaining graph, as well as between B and the remaining graph. It turns out that the inter-group assortativity coefficient between A and the remaining graph is -0.033, whereas that between B and the remaining graph is 0.025. As a consequence, although A and B are isomorphic when extracted from the graph, their different connectivity tendencies toward the rest of the graph result in a drastic discrepancy in the disease spreading process. This example clearly shows the significance of partial connection tendencies for network analysis.



III. REAL NETWORK ANALYSIS 

We apply the UAC analysis to various real-world networks. Table I reports $r$, $P(\rho_e > 0)$, $S_{ae}$, $S_{de}$ and $P(\rho_v > 0)$ for different kinds of networks. These networks can be roughly categorized into five kinds: technical networks, biological networks, social networks, online social networks, and synthesized networks.

From this table, we see: 

1. For a majority of the real networks considered in this paper, e.g., AS, Router, and Email-Enron, we surprisingly find that assortative edges/nodes outnumber dissortative edges/nodes despite the networks' pronounced global dissortativity. For the synthesized ER network, in contrast, the number of assortative edges almost equals that of dissortative edges, and their average strengths are nearly indistinguishable as well. Hence, the network as a whole has no mixing pattern.

2. The global network assortativity is determined by both the ratio of assortative edges and the strength of these edges. For instance, in SCN, both the ratio and the average strength of assortative edges are greater than those of dissortative edges; hence, SCN exhibits strong assortativeness as a whole. In comparison, although the number of assortative edges in the AS network also exceeds that of dissortative ones, the average strength of assortative edges is much weaker than that of dissortative ones. Hence, the dissortativity of this network comes from the relatively greater strength of a smaller number of dissortative edges. This is true for quite a number of other dissortative networks.

3. Here we reconfirm the fact that online social networks are dissortatively mixed, whereas real-world social networks are assortatively mixed [36]. We observe that the ratio of assortative edges in online social networks is comparatively lower than that in real-world social networks, although the total number of assortative edges still exceeds that of dissortative ones. However, the average strength of dissortative edges is greater than that of assortative edges in online social networks, whereas in real-world social networks the situation is just the opposite. This reflects the fact that online social networks can to some extent eliminate social barriers between people of different social positions, making it much easier for people at the bottom of society to set up links to people at the top of society.

4. According to global assortativity and local edge assortativity statistics, networks can be categorized into four kinds: globally assortative with a majority of assortative edges, globally assortative but with a majority of dissortative edges, globally dissortative with a majority of dissortative edges, and globally dissortative but with a majority of assortative edges. Table II categorizes the networks along the two dimensions. It remains an open question whether there is a real network that exhibits global assortativity but primarily consists of dissortative edges.
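
The edge statistics and the four-way categorization discussed in the list above can be sketched in Python as follows. The function names `edge_statistics` and `category`, the dictionary keys, and the toy network are hypothetical and chosen only for illustration. The sketch assumes an undirected simple graph that is not homogeneous, and it does not handle ties ($r = 0$ or a ratio of exactly 50%). It also checks numerically that $r$ equals $M$ times the difference between (ratio times average strength) of assortative edges and that of dissortative edges, which follows directly from $r$ being the sum of all $\rho_e$.

```python
from collections import Counter


def edge_statistics(edges):
    """Return r, P(rho_e > 0), P(rho_e < 0), S_ae and S_de for an edge list."""
    m = len(edges)
    degree = Counter(node for edge in edges for node in edge)
    ends = [degree[node] - 1 for edge in edges for node in edge]
    u_q = sum(ends) / (2 * m)
    var_q = sum((j - u_q) ** 2 for j in ends) / (2 * m)
    if var_q == 0:
        raise ValueError("homogeneous network: rho_e needs a convention")
    rho = [(degree[u] - 1 - u_q) * (degree[v] - 1 - u_q) / (m * var_q)
           for u, v in edges]
    pos = [x for x in rho if x > 0]
    neg = [-x for x in rho if x < 0]
    return {
        "r": sum(rho),
        "p_assortative": len(pos) / m,
        "p_dissortative": len(neg) / m,
        "S_ae": sum(pos) / len(pos) if pos else 0.0,
        "S_de": sum(neg) / len(neg) if neg else 0.0,
    }


def category(stats):
    """Place a network in one of the four kinds (ties are not handled)."""
    global_part = ("globally assortative" if stats["r"] > 0
                   else "globally dissortative")
    local_part = ("mostly assortative edges" if stats["p_assortative"] > 0.5
                  else "mostly dissortative edges")
    return global_part + ", " + local_part


# Hypothetical toy network: a 4-leaf star (hub 0) plus a separate 6-cycle.
star = [(0, leaf) for leaf in range(1, 5)]
cycle = [(5 + i, 5 + (i + 1) % 6) for i in range(6)]
stats = edge_statistics(star + cycle)
m = len(star + cycle)
# r is the edge count times the balance of (ratio x average strength).
balance = m * (stats["p_assortative"] * stats["S_ae"]
               - stats["p_dissortative"] * stats["S_de"])
```

In this toy network, the six cycle edges are weakly assortative and the four star edges are strongly dissortative, so 60% of the edges are assortative while the network is globally dissortative. This is the same qualitative pattern that the text describes for the AS network.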

In the following, we use the AS-level Internet topology 
as an example to illustrate how the universal assortativ- 
ity coefficient can be used to calculate the connectivity 
tendency of intra-group or inter-group connections. In 
the AS-level topology, a natural group partition of clear 
and explicit meaning is to partition the ASes according 
to their geographical regions. Today, five regional In- 
ternet registries (RIR) are managing the allocation and 
registration of Internet number resources (including AS 
numbers) within a particular region of the world. The 
five RIRs are: AfriNIC for Africa, ARIN for the United 
TABLE I. Connection tendencies for different categories of networks (a).

Category                Name                      r        P(ρ_e > 0)   S_ae            S_de            P(ρ_v > 0)
Technical network       AS-2011-6 [24]            -0.184   60.4%        1.96 × 10^…     8.71 × 10^…     58.3%
                        Router [10]               -0.138   51.3%        5.80 × 10^-5    1.07 × 10^-4    57.5%
                        USAir [25, 26]            -0.208   42.1%        2.76 × 10^…     3.70 × 10^…     37.3%
Biological network      PPI [27]                  -0.102   50.5%        6.92 × 10^…     1.02 × 10^-4    52.6%
                        celegansneural [28, 29]   …        …            1.11 × 10^-4    …               42.1%
                        foodweb_Florida [26, 30]  …        …            …               …               …
Social network          SCN [11, 12]              0.161    61.4%        1.33 × 10^…     1.07 × 10^…     69.7%
                        CA-HepTh [31]             0.268    63.1%        2.78 × 10^-5    1.96 × 10^-5    72.8%
                        CA-GrQc [31]              0.659    80.6%        6.32 × 10^-5    2.78 × 10^-5    88.9%
Online social network   soc-Epinions1 [32]        -0.041   58.8%        5.81 × 10^…     1.07 × 10^…     71.6%
                        Email-Enron [33, 34]      -0.111   58.9%        1.21 × 10^-6    3.19 × 10^-6    51.5%
Synthesized network     ER [35]                   -0.001   50.7%        1.24 × 10^…     1.27 × 10^…     50.4%

(a) The result for the ER network is an average over 10 runs. We treat soc-Epinions1, celegansneural and foodweb_Florida, originally directed networks, as undirected networks by treating each directed edge as an undirected one and eliminating duplicate edges.



TABLE II. Categorization of networks by global assortativity and edge assortativity statistics.

                     r > 0                        r < 0
P(ρ_e > 0) > 50%     SCN, CA-HepTh, CA-GrQc       AS, Router, soc-Epinions1, Email-Enron
P(ρ_e > 0) < 50%     -                            USAir, foodweb_Florida



FIG. 2. (Color Online) Map of regional Internet registries. 

States, Canada, several parts of the Caribbean region and Antarctica, APNIC for Asia, Australia, New Zealand, and neighboring countries, LACNIC for Latin America and parts of the Caribbean region, and RIPENCC for Europe, the Middle East and Central Asia (see Fig. 2 for a graphical representation of the regions for which the five RIRs are responsible). This gives us a coarse partition of the ASes according to the five regions. A more fine-grained partition further divides each region by country or region. Hence, we have a two-level partitioning. The first-level groups consist of ASes belonging to the same regional Internet registry, and the second-level groups consist of ASes belonging to the same country or region, following the ISO 3166-1 standard.

Table III reports both the intra-RIR and inter-RIR assortativity coefficients. We observe that, except for ARIN, all RIRs are internally assortative. For inter-RIR connections, we observe that connections between ARIN and all other RIRs are dissortative. RIPENCC behaves similarly to ARIN, except that its connections with AfriNIC are slightly assortative. Connections among AfriNIC, APNIC, and LACNIC are all assortative. This connectivity tendency reflects the fact that, broadly, the regions covered by ARIN and RIPENCC are the core of the Internet. However, RIPENCC differs from ARIN in that RIPENCC itself is assortative whereas ARIN is dissortative. This can be explained more precisely by the more fine-grained connection tendencies of countries and regions. Fig. 3 reports the intra- and inter-country/region assortativity coefficients for those countries and regions with more than 80 observed ASes (we choose 80 as the threshold to ensure that each RIR has at least one country or region in this map). In this figure, a vacant grid cell means that no AS connections are observed between the two countries/regions. Different colors are used to discretize the strength of assortativity/dissortativity within and between countries/regions. Several clear patterns can be observed from this plot. First, except for the US, all countries/regions are internally assortatively mixed, as illustrated by the diagonal of the plot. Second, a few countries/regions, namely US, CA, GB, EU, and DE, primarily show dissortative connectivity tendencies toward other countries/regions. Finally, interconnections between the other countries/regions are mostly assortative.

Statistically, on the RIR scale, we find that about 67.3% of intra-RIR edges are assortative, whereas only 32.3% of inter-RIR edges are assortative. On the country/region scale, 69.7% of intra-country/region edges are assortative, whereas only 44.9% of inter-country/region edges are assortative. Considering that globally 60.4% of edges are assortative, it is apparent that, on both scales, edges within the same regional area are more likely to be assortative than the global average of 60.4%, whereas edges linking different regional areas are far less likely to be assortative than the
average ratio. This locality-driven difference in connec- 
TABLE III. Intra-RIR and inter-RIR assortativity coefficients.

Size     RIR       AfriNIC         APNIC            LACNIC           RIPENCC          ARIN
380      AfriNIC   9.35 × 10^-4    3.86 × 10^-5     1.13 × 10^-5     1.24 × 10^-4     -0.002
3711     APNIC     3.86 × 10^-5    0.014            1.48 × 10^-4     -2.93 × 10^-4    -0.01
1209     LACNIC    1.13 × 10^-5    1.48 × 10^-4     0.004            -4.61 × 10^-4    -0.007
13401    RIPENCC   1.24 × 10^-4    -2.93 × 10^-4    -4.61 × 10^-4    0.021            -0.083
11172    ARIN      -0.002          -0.01            -0.007           -0.083           -0.121

FIG. 3. (Color online) Connection tendencies between and within different countries and regions.



tivity patterns is a characteristic feature of AS-level In- 
ternet topology, which however, cannot be revealed by 
the global assortativity coefficient. 
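
The intra-group versus inter-group comparison above can be sketched in Python as follows. This is only an illustration: the function name `assortative_ratio_by_locality`, the `group_of` mapping (for example node to RIR or country code), and the toy graph are hypothetical, and the sketch assumes an undirected simple graph that is not homogeneous. Because $M\sigma_q^2$ is positive, the sign of $\rho_e$ is the sign of $(j - U_q)(k - U_q)$, so only that product is evaluated.

```python
from collections import Counter


def assortative_ratio_by_locality(edges, group_of):
    """Return (intra, inter): fractions of edges with rho_e > 0.

    `group_of` maps each node to a group label (for example an RIR or a
    country code). A fraction is None when no edge of that kind exists.
    """
    m = len(edges)
    degree = Counter(node for edge in edges for node in edge)
    ends = [degree[node] - 1 for edge in edges for node in edge]
    u_q = sum(ends) / (2 * m)  # global expected remaining degree
    intra, inter = [], []
    for u, v in edges:
        # The sign of rho_e is the sign of this product (M * sigma_q^2 > 0).
        assortative = (degree[u] - 1 - u_q) * (degree[v] - 1 - u_q) > 0
        (intra if group_of[u] == group_of[v] else inter).append(assortative)

    def fraction(flags):
        return sum(flags) / len(flags) if flags else None

    return fraction(intra), fraction(inter)


# Hypothetical toy graph: a path 0 - 1 - 2 - 3 with three groups.
path = [(0, 1), (1, 2), (2, 3)]
groups = {0: "a", 1: "b", 2: "b", 3: "c"}
intra_ratio, inter_ratio = assortative_ratio_by_locality(path, groups)
```

In this toy example, the only intra-group edge (1, 2) is assortative and both inter-group edges are dissortative, so the function returns a higher ratio for intra-group edges than for inter-group edges.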



IV. DISCUSSION AND CONCLUSION 

Prior to our definition, a local assortativity coefficient was proposed as a local metric [37, 38] that measures an individual node's connection tendency; it is defined by calculating the contribution of each node to the global assortativity coefficient. However, that calculation is arguable, because there is no precise and unique way to deterministically quantify each node $v$'s contribution to the combined term $U_q^2$, which is calculated collectively from the edge set. For example, supposing the remaining degree of a node $v$ is $j$, the contribution of $v$ may take many forms, such as $(j / \sum_{v \in V} j)\,U_q^2$, $(j^2 / \sum_{v \in V} j^2)\,U_q^2$, and so on. None of these forms can justify itself. This is because the calculation of $U_q^2$ is a unified process that is closely related to the complex correlations of the network structure, so we cannot decompose this term into each node's contribution as if nodes were independent of each other. In contrast, our definition is more straightforward in that it calculates each edge's contribution to the global assortativity coefficient, rather than each node's contribution to a term in the formula. As a result, our definition completely avoids the bias issue in that definition [38]. More
TABLE IV. Country and region codes for the corresponding IDs in Fig. 3 and the number of ASes owned by these countries and regions.

RIR name   ID   Country/region code   Number of ASes
AfriNIC    0    ZA                    95
APNIC      1    AU                    586
APNIC      2    KR                    539
APNIC      3    JP                    483
APNIC      4    ID                    339
APNIC      5    IN                    306
APNIC      6    HK                    194
APNIC      7    CN                    166
APNIC      8    TH                    162
APNIC      9    NZ                    151
APNIC      10   SG                    123
APNIC      11   PH                    118
APNIC      12   TW                    100
APNIC      13   BD                    85
ARIN       14   US                    10406
ARIN       15   CA                    674
LACNIC     16   BR                    574
LACNIC     17   AR                    141
LACNIC     18   MX                    133
RIPENCC    19   RU                    2544
RIPENCC    20   UA                    1146
RIPENCC    21   GB                    1059
RIPENCC    22   EU                    1016
RIPENCC    23   DE                    904
RIPENCC    24   PL                    882
RIPENCC    25   CZ                    505
RIPENCC    26   FR                    431
RIPENCC    27   IT                    423
RIPENCC    28   BG                    349
RIPENCC    29   NL                    348
RIPENCC    30   CH                    324
RIPENCC    31   SE                    306
RIPENCC    32   AT                    268
RIPENCC    33   RO                    243
RIPENCC    34   ES                    213
RIPENCC    35   TR                    170
RIPENCC    36   LV                    150
RIPENCC    37   IL                    149
RIPENCC    38   DK                    141
RIPENCC    39   SI                    125
RIPENCC    40   HU                    123
RIPENCC    41   IR                    117
RIPENCC    42   FI                    111
RIPENCC    43   BE                    110
RIPENCC    44   NO                    105


importantly, from the edge assortativity we can define the universal assortativity coefficient, which is capable of analyzing any part of a network.

To summarize, we present a universal assortativity coefficient (UAC) that can be used to calculate connection tendencies on any part of a network, such as communities and groups on multiple network scales. When the target edge set is the set of all edges, UAC is exactly the global assortativity coefficient (GAC). In this sense, GAC is a special case of UAC. Moreover, the definition is deterministic and completely avoids the bias issue that accompanies the node-based local assortativity coefficient definition. UAC helps to uncover individual, partial, and global assortativity patterns in various networks. Applying UAC to real-world networks, we find that, contrary to popular expectation, most globally dissortative networks are still dominated by assortative edges, though with weak strength. This observation also motivates us to classify networks along two dimensions into four categories, characterized by their global assortativity coefficient and local assortativity statistics. We expect that this measure can be widely applied to various networks, such as popular online social networks, ubiquitous modern communication networks, and transportation networks, helping to uncover more hidden patterns in networks and ultimately allowing a deeper understanding of the network dynamics caused by the structural differences discerned by UAC.